Portfolio Construction and Testing Using Cluster Analysis

 

Anju Ans Saji, Joshna Joys Maria Joseph, Dr Sathish Kumar B

Christ (Deemed to be University) Hosur Road, Bengaluru 560029, Karnataka, India

*Corresponding Author E-mail: joshna.joseph@mcom.christuniversity.in

 

ABSTRACT:

In this research article, we aim to build an efficient portfolio from Nifty 50 companies using five selected variables: Price to Earnings ratio, Total returns, Turnover, Price to Book value ratio and Dividend Yield. The paper applies the K-means clustering approach to construct an optimal portfolio that yields maximum returns. An optimal portfolio satisfies the need to diversify the investment pattern in order to minimise risk. The five variables were observed over the period April 2015 to March 2018 and used to cluster the stocks. The efficiency of the three clusters formed was measured using the return of each stock: the stock returns in each cluster were totalled, and on comparing these totals the first cluster showed the maximum return relative to the other two clusters.

 

KEYWORDS: Cluster analysis, K-means, Portfolio construction.

 

 


INTRODUCTION:

As stated by Harry Markowitz, there are two phases in the process of selecting a portfolio. The first is observation and the formation of expectations about the future performance of the stocks, while the second is the selection of a portfolio in view of that expected future performance (Markowitz, 1952). Investors can choose any combination of the available stocks to create their portfolio. A portfolio is a group of financial assets put together on the basis of the investor’s decision. Risk diversification is one of the major benefits to an investor when the investment is spread over a wide range of assets.

 

The Industrial Policy of 1991 was one of the biggest reforms of the Indian economy post-independence. The main objective of this policy was to liberalise, privatise and globalise the Indian markets. The history of the Indian stock market can be traced back to the year 1875 and the formation of the Bombay Stock Exchange (BSE). Lack of transparency, inconvenience in trading, inconsistency and delay in adapting to growing technology were among the defects of the Bombay Stock Exchange. The National Stock Exchange (NSE) came into existence to overcome the challenges faced by the Bombay Stock Exchange.

 

Growth of these stock markets is necessary for the development of the Indian economy. Trading in these stock markets has increased phenomenally when compared to previous decades. This is due to the ease of access, knowledge and regulation that have developed in the markets as well as among investors. Individual and institutional investors are encouraged to invest in various stocks through tax benefits, various savings options, retirement benefits, means to avoid unnecessary spending, good returns etc.

 

The objective of this paper is to help investors in decision making during the selection and construction of their portfolios. Selection of an optimal portfolio is essential for yielding maximum returns. In most cases, the decisions resulting in the selection of these portfolios are based only on prima facie evidence. Since the performance of a portfolio depends on several factors other than those considered at first sight, a decision based on prima facie evidence may not yield the expected returns. We aim at bringing out an efficient portfolio with the help of cluster analysis. Clustering can be defined as “a mathematical technique designed for revealing classification structures in the data collected in the real-world phenomena” (Mirkin, 1996). In this paper, we use this powerful data mining tool to group our stocks into smaller clusters based on the homogeneity of the data.

 

LITERATURE REVIEW:

Demutualization of stock exchanges has increased the quality of the trading transaction. This makes the National Stock Exchange (NSE) perform better when compared to the Bombay Stock Exchange (BSE) (Krishnamurti, Sequeira, and Fangjian, 2003). In agreement with this, Nifty 50 serves several purposes as it is a diversified stock index. In line with the above statement, portfolio creation of Nifty 50 is more relevant as it is considered to be much more stable than the other indices in India (Sathyanarayana and Harish, 2017).

 

In a parallel view, selection of portfolios is influenced by the background risk that arises from different sources. It is one of the various portfolio risks that is required to be taken into consideration. In line with this, the different sources that can be listed are labour income, proprietary income and real estate (Jiang, Ma, and An, 2010). On one hand, the inclusion of various aspects of the stock’s characteristics will be a merit to the investors in terms of stock’s performance, risk and return (Ammann, Coqueret, and Schade, 2016). On the other hand, with the availability of various investment options, the risk incidental to these remains persistent. This makes the selection of an optimal portfolio for an investment a pragmatic problem even now (Mei and Nogales, 2018). In a slightly different focus, while considering the different investors in the market there can be two different categories, investors who trade with ambiguity and investors who avoid ambiguity. Taking this into consideration the various factors that are examined are trading with high transaction cost and returns with high inconsistencies (Zhang, Jin, and An, 2017).

 

Individual investors tend to invest in a way that is not in accordance with most financial and portfolio theories. Their choices deviate to a large extent from these theories, which makes portfolio selection difficult. This is because they consider only their home-country investment avenues, ignore foreign investment options, and are overconfident about the data they receive through financial brokerages (Jacobs, Müller, and Weber, 2014). The researchers also state that investing in various asset classes is advantageous to individual investors because it diversifies risk efficiently (Jacobs, Müller, and Weber, 2014). In contrast to the above findings, it is said that overconfidence leads individual investors to take more risk and trade in large quantities, but that this is not the case for institutional investors, because they have more accurate information about the stocks than individual investors do (Palomino and Sadrieh, 2011).

 

The income of individual investors also has a strong influence on the decision to invest and develop a portfolio. If there is uncertainty in the income of individuals, then there is uncertainty in market performance. Therefore, income is also considered a risk factor in constructing portfolios (Bonaparte, Korniotis, and Alok, 2014). In relation to this, many individual investors seek guidance from professional services to develop a high-return portfolio. These professionals are under greater pressure to provide a high-return portfolio when compared to other professional services available in the market. Accordingly, investors seeking professional guidance are more likely to receive greater returns than those who invest on their own (Huddart, 1999).

 

On one hand, the information regarding the performance of stocks has an impact on the selection of the portfolio. Depending on the stocks available to investors, their investment decisions will differ (Massa, Simonov, and Stenkrona, 2015). On the other hand, the timing of investments and the group of assets in which the investment is made have an impact on the performance of the portfolio (Ciccotello, Greene, Ling, and Rakowski, 2011). In parallel to this, reallocation of portfolio stocks from one professional to another increases the variability of the returns from these portfolios. These reallocations create fluctuations in overall market returns and affect individual investors (Anand, Chakravarty, and Chuwonganant, 2009). To elaborate further, taking into consideration the price, volume and various other factors helps to predict future stock returns. This enhances the predictability of excess returns over risk-free rates. The factors that cause variation in stock returns cannot be identified every time, but in order to identify them we can use various ratios such as the dividend pay-out ratio, book-to-market ratio and dividend yield ratio (Lin, 2018). Stocks that have extreme returns show high fluctuations, which affect their performance in the stock market. The inclusion of these stocks in a portfolio affects its entire performance, so removing them from the portfolio helps to increase its performance and its returns (Yang and Zhang, 2019). In the same vein, risk and diversification are inter-dependent. Since risk is spread across the portfolio, risk will be high if the portfolio is under-diversified and minimal if it is properly diversified (Merkle, 2018).

 

In a different sphere, cluster analysis is a method used to group stocks based on the similar features they possess. The advantage of using cluster analysis is that objects with similar features can be analysed together and clusters with different objects can be analysed separately (L, GY, and N, 1981). Another benefit of cluster analysis is that it reduces the time required to construct a portfolio (Nanda, Mahanty, and Tiwari, 2010). Irregularities between classifications such as style or objective and the performance achieved exist among various investments. Also, historic trends have to be analysed before clustering the data because the performance of each cluster varies depending upon the classification pattern (Sidana and Acharya, 2007). Credit risk is another factor that can be considered while clustering (Klotz and Lindermeir, 2015). A further factor that affects the performance of each cluster is the price uncertainty that exists in the stock market when trading begins on a particular day (Ohta, 2006). On the contrary, by the end of a trading day, the uncertainties in the stock market reduce due to lower chances of clustering based on trading size (Garvey and Wu, 2014). Firms that have common systematic risk have greater capacity when grouped and analysed through clusters. In case of a dilemma in identifying attributes of the selected variables, cluster analysis aids in selecting the proxies (Ingram and Margetis, 2010). When two variables from a larger group are taken as small samples and linked together, the resulting performance can be jointly determined (Gonzalez and Rubio, 2017). Relating to this, the reliability and profitability of the stocks in the portfolios formed can be understood through cluster analysis (Klotz and Lindermeir, 2015).

 

A review of the literature makes it evident that portfolio selection and construction in the Indian stock market has not been studied extensively. While a study on ‘Clustering Indian stock market data for portfolio management’ focuses on the stocks of the Bombay Stock Exchange (BSE), studies specific to stocks on the National Stock Exchange (NSE) have not been conducted (Nanda, Mahanty, and Tiwari, 2010). Hence, this is the gap that we are trying to address.

 

OBJECTIVE:

The research is carried out with the following objectives:

·         To construct an efficient portfolio using cluster analysis based on the selected variables.

·         To test the constructed portfolios and measure their efficiency.

 

METHODOLOGY:

The stock performance data of Nifty 50 companies listed on the National Stock Exchange for a period of three years are analysed using a clustering technique with the help of SPSS. We have used K-means clustering to analyse our data. Since K-means is a non-hierarchical method, the number of clusters to be formed has to be fixed in advance on the basis of human judgement. It was noticed that when the number of clusters was increased beyond three, the companies were merely split and divided among the extra clusters added after the third. Therefore, the number of clusters is fixed at three. K-means defines K targets, known as centroids, each of which is taken as the centre of a particular cluster. Each cluster therefore contains the observations whose values lie closest to its central value (J.Garbade, 2018).
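As an illustration of the procedure described above, the following is a minimal Python sketch of K-means (Lloyd's algorithm) written with the standard library only. It is not the SPSS routine used in this study: the function names `euclidean` and `kmeans`, the toy two-variable observations and the choice of starting centres are all assumptions made for the example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, initial_centres=None, max_iter=100):
    """Return (labels, centres) after running K-means on `points`."""
    # Assumption: if no starting centres are given, use the first k points.
    centres = [list(c) for c in (initial_centres or points[:k])]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest centre.
        labels = [min(range(k), key=lambda c: euclidean(p, centres[c])) for p in points]
        # Update step: move each centre to the mean of its current members.
        new_centres = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centres.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centres.append(centres[c])  # an empty cluster keeps its old centre
        # Convergence: stop once no centre has moved.
        moved = max(euclidean(old, new) for old, new in zip(centres, new_centres))
        centres = new_centres
        if moved == 0:
            break
    return labels, centres

# Hypothetical two-variable observations forming three obvious groups.
toy_points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.1, 4.9), (9.0, 1.0), (9.2, 1.1)]
labels, centres = kmeans(
    toy_points, k=3, initial_centres=[toy_points[0], toy_points[2], toy_points[4]]
)
print(labels)
```

The number of clusters `k` is an input to the algorithm rather than something it discovers, which is why it has to be chosen by judgement before the analysis is run.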

 

Data Description:

The data set for this study was collected from Prowessiq. We have taken data for all Nifty 50 companies for three years, from 1st April 2015 to 31st March 2018, on the variables listed below.

 

Variables:

In this paper we have considered the following five variables:

| Variables | Period (1/04/2015-31/03/2018) |
|---|---|
| P/E ratio | 3 years annual data |
| Total returns | 3 years annual data |
| Turnover | 3 years annual data |
| P/B ratio | 3 years annual data |
| Dividend Yield | 3 years annual data |

 

After running an outlier test for each variable using a boxplot, six outliers were identified and removed. Therefore, for the purpose of this study, 44 individual stocks are analysed to construct an efficient portfolio. The stocks removed from the data set of Nifty 50 companies are Bajaj Finance Ltd., Bajaj Finserv Ltd., Hindustan Unilever Ltd., Vedanta Ltd., Reliance Industries Ltd. and Housing Development Finance Corpn. Ltd.
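The boxplot screening described above can be sketched in Python as follows. This assumes the common boxplot convention of flagging values more than 1.5 interquartile ranges beyond the quartiles; quartile conventions differ between packages, so the result need not match the SPSS output exactly. The function name `boxplot_outliers` and the P/E values are hypothetical and are not taken from the study's data set.

```python
import statistics

def boxplot_outliers(values, whisker=1.5):
    """Return the values lying outside Q1 - whisker*IQR and Q3 + whisker*IQR."""
    # Quartiles by linear interpolation over the sample (one of several conventions).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical P/E ratios with one extreme value.
pe_ratios = [12.0, 15.5, 18.2, 20.1, 22.4, 25.0, 27.3, 30.8, 120.0]
print(boxplot_outliers(pe_ratios))
```

Applying such a rule to each variable in turn and dropping any stock flagged on at least one of them is one way to arrive at a reduced sample like the 44 stocks used here.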

 

The price-earnings ratio shows the potential of the company to earn profits based on market performance. A higher price-earnings ratio means that investors are expecting the company to grow further in the future (Sha, 2017). Total return here refers to capital appreciation as well as the return received from a portfolio of investments. The dividend yield is the total amount of dividend paid by a company during a year divided by the stock price prevailing at the beginning of the same year. It also tells us the variation in the dividend paid based on fluctuations in the earnings of the company (Morgan and Thomas, 1998). The turnover ratio measures the relationship between the assets and liabilities of a company and its total sales, that is, how much of the assets and liabilities can be replaced using revenue from sales (Bragg, 2018). The P/B ratio shows the company’s present market value relative to its book value. This helps an investor to understand whether the stock is undervalued or overvalued.
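The valuation ratios described above can be written as simple functions. The sketch below uses the usual per-share textbook forms for P/E and P/B, and the dividend yield definition given in the text (dividend for the year over the price at the start of that year). The function names and the input figures are hypothetical; the study itself takes these variables directly from the database.

```python
def price_to_earnings(price, earnings_per_share):
    """P/E: market price per share divided by earnings per share."""
    return price / earnings_per_share

def price_to_book(price, book_value_per_share):
    """P/B: market price per share divided by book value per share."""
    return price / book_value_per_share

def dividend_yield(dividend_per_share, opening_price):
    """Dividend paid during the year divided by the price at the start of the year."""
    return dividend_per_share / opening_price

# Hypothetical figures for a single stock.
pe = price_to_earnings(price=500.0, earnings_per_share=25.0)
pb = price_to_book(price=500.0, book_value_per_share=100.0)
dy = dividend_yield(dividend_per_share=10.0, opening_price=400.0)
print(pe, pb, dy)
```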

 

Table 1. The initial cluster centres

| Initial Cluster Centres | Cluster 1 | Cluster 2 | Cluster 3 |
|---|---|---|---|
| Average - P/E ratio | 66.86 | 60.813 | 37.42 |
| Average - P/B ratio | 13.933 | 14.103 | 1.397 |
| Average - Yield | 0.203 | 0.753 | 0.643 |
| Average - Returns | 0.617 | -0.307 | -0.13 |
| Average - Turnover | 4790.93 | 1141.44 | 8282.11 |
| Average - Total Debt/ Equity | 0.293 | 0.005 | 1.48 |

 

The above table shows the K-means centre values for each variable in the three clusters. Companies whose variable values lie around a centre value are grouped into that cluster. This is the initial structure from which the clusters are formed. There are negative values in the second and third clusters, which need to be optimised through iteration.

An optimal clustering can be obtained by eliminating the least prominent clusters through iteration (Nanda, Mahanty, and Tiwari, 2010).
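To make the idea of grouping companies around a centre value concrete, the following Python sketch assigns one stock to the nearest of the initial centres reported in Table 1, using Euclidean distance on the unscaled variables. The stock's values are hypothetical, the function names are illustrative, and the sketch assumes (rather than confirms) that the distances in this study were computed on unscaled data.

```python
import math

# Initial cluster centres from Table 1, in the order:
# P/E, P/B, yield, returns, turnover, total debt/equity.
INITIAL_CENTRES = {
    1: (66.86, 13.933, 0.203, 0.617, 4790.93, 0.293),
    2: (60.813, 14.103, 0.753, -0.307, 1141.44, 0.005),
    3: (37.42, 1.397, 0.643, -0.13, 8282.11, 1.48),
}

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_cluster(stock, centres):
    """Return the label of the centre closest to `stock`."""
    return min(centres, key=lambda label: euclidean(stock, centres[label]))

# Hypothetical stock, not one of the 44 companies analysed.
hypothetical_stock = (40.0, 2.0, 0.6, -0.1, 8000.0, 1.2)
assigned = nearest_cluster(hypothetical_stock, INITIAL_CENTRES)
print(assigned)
```

Because turnover is measured in thousands while the other variables are small ratios, it contributes by far the largest term to the distance in this sketch.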

 

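Table 2. Iteration History

Change in Cluster Centres

| Iteration | Cluster 1 | Cluster 2 | Cluster 3 |
|---|---|---|---|
| 1 | 245.813 | 723.066 | 704.274 |
| 2 | 212.708 | 45.006 | 419.120 |
| 3 | 286.466 | 45.006 | 167.128 |
| 4 | 181.045 | .000 | 164.417 |
| 5 | .000 | .000 | .000 |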

 

An iteration table shows how the cluster centres change over successive runs of K-means. The iteration table above (Table 2) shows that after the fourth iteration there are no further changes in the cluster centres: the change recorded at the fifth iteration is zero for all three clusters. The solution has therefore converged, and the negative values have also been eliminated.
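The stopping rule implied by the iteration history can be expressed in a few lines of Python. The sketch below copies the changes in cluster centres from Table 2 and reports the first iteration at which every centre has stopped moving; the function name `convergence_iteration` and the `tolerance` parameter are illustrative assumptions, not part of the SPSS output.

```python
# Change in cluster centres per iteration, copied from Table 2
# (columns: cluster 1, cluster 2, cluster 3).
ITERATION_HISTORY = [
    (245.813, 723.066, 704.274),
    (212.708, 45.006, 419.120),
    (286.466, 45.006, 167.128),
    (181.045, 0.000, 164.417),
    (0.000, 0.000, 0.000),
]

def convergence_iteration(history, tolerance=0.0):
    """Return the 1-based iteration at which all centre changes are <= tolerance, else None."""
    for number, changes in enumerate(history, start=1):
        if all(abs(change) <= tolerance for change in changes):
            return number
    return None

converged_at = convergence_iteration(ITERATION_HISTORY)
print(converged_at)
```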


 

Table 3. Individual stocks falling under each cluster after iteration.

| Cluster 1: Company Name | Distance | Cluster 2: Company Name | Distance | Cluster 3: Company Name | Distance |
|---|---|---|---|---|---|
| Grasim Industries Ltd. | 356.856 | Adani Ports and Special Economic Zone Ltd. | 390.923 | I C I C I Bank Ltd. | 132.971 |
| H D F C Bank Ltd. | 981.883 | Asian Paints Ltd. | 723.066 | Indiabulls Housing Finance Ltd. | 689.605 |
| I T C Ltd. | 742.353 | Axis Bank Ltd. | 722.663 | Infosys Ltd. | 328.232 |
| Indian Oil Corpn. Ltd. | 307.193 | Bajaj Auto Ltd. | 306.513 | Maruti Suzuki India Ltd. | 990.058 |
| Indusind Bank Ltd. | 345.564 | Bharat Petroleum Corpn. Ltd. | 102.425 | State Bank of India | 1454.935 |
| Kotak Mahindra Bank Ltd. | 30.423 | Bharti Airtel Ltd. | 611.265 | Tata Consultancy Services Ltd. | 838.351 |
| Larsen and Toubro Ltd. | 470.121 | Bharti Infratel Ltd. | 691.446 | Tata Steel Ltd. | 685.285 |
| Tata Motors Ltd. | 626.525 | Britannia Industries Ltd. | 591.078 | Yes Bank Ltd. | 1150.92 |
| Titan Company Ltd. | 924.799 | Cipla Ltd. | 558.778 | | |
| | | Coal India Ltd. | 577.981 | | |
| | | Dr. Reddy'S Laboratories Ltd. | 610.639 | | |
| | | Eicher Motors Ltd. | 139.605 | | |
| | | G A I L (India) Ltd. | 77.614 | | |
| | | H C L Technologies Ltd. | 530.632 | | |
| | | Hero Motocorp Ltd. | 673.975 | | |
| | | Hindalco Industries Ltd. | 868.317 | | |
| | | J S W Steel Ltd. | 262.028 | | |
| | | Mahindra and Mahindra Ltd. | 301.903 | | |
| | | N T P C Ltd. | 939.8 | | |
| | | Oil and Natural Gas Corpn. Ltd. | 534.164 | | |
| | | Tech Mahindra Ltd. | 337.08 | | |
| | | U P L Ltd. | 204.536 | | |
| | | Ultratech Cement Ltd. | 359.213 | | |
| | | Wipro Ltd. | 674.37 | | |
| | | Zee Entertainment Enterprises Ltd. | 111.96 | | |

Table 4. Testing the Clusters

| Cluster 1 | Return % | Cluster 2 | Return % | Cluster 3 | Return % |
|---|---|---|---|---|---|
| Grasim Industries Ltd. | -0.024 | Adani Ports and Special Economic Zone Ltd. | -0.0016 | I C I C I Bank Ltd. | -0.029 |
| H D F C Bank Ltd. | 0.108 | Asian Paints Ltd. | 0.00453 | Indiabulls Housing Finance Ltd. | 0.073 |
| I T C Ltd. | -0.052 | Axis Bank Ltd. | -0.01 | Infosys Ltd. | 0.037 |
| Indian Oil Corpn. Ltd. | -0.466 | Bajaj Auto Ltd. | -0.013 | Maruti Suzuki India Ltd. | 0.147 |
| Indusind Bank Ltd. | 0.094 | Bharat Petroleum Corpn. Ltd. | -0.216 | State Bank Of India | -0.091 |
| Kotak Mahindra Bank Ltd. | 0.067 | Bharti Airtel Ltd. | 0.044 | Tata Consultancy Services Ltd. | 0.059 |
| Larsen and Toubro Ltd. | -0.141 | Bharti Infratel Ltd. | -0.012 | Tata Steel Ltd. | -7.819 |
| Tata Motors Ltd. | -0.164 | Britannia Industries Ltd. | 0.149 | Yes Bank Ltd. | -1.648 |
| Titan Company Ltd. | 0.266 | Cipla Ltd. | -0.046 | | |
| | | Coal India Ltd. | -0.027 | | |
| | | Dr. Reddy'S Laboratories Ltd. | -0.13 | | |
| | | Eicher Motors Ltd. | 0.031 | | |
| | | G A I L (India) Ltd. | -0.096 | | |
| | | H C L Technologies Ltd. | 0.037 | | |
| | | Hero Motocorp Ltd. | 0.034 | | |
| | | Hindalco Industries Ltd. | 0.024 | | |
| | | J S W Steel Ltd. | 0.155 | | |
| | | Mahindra and Mahindra Ltd. | -0.375 | | |
| | | N T P C Ltd. | 0.006 | | |
| | | Oil and Natural Gas Corpn. Ltd. | -0.029 | | |
| | | Power Grid Corpn. Of India Ltd. | -0.012 | | |
| | | Sun Pharmaceutical Inds. Ltd. | -0.158 | | |
| | | Tech Mahindra Ltd. | 0.124 | | |
| | | U P L Ltd. | -0.016 | | |
| | | Ultratech Cement Ltd. | -0.017 | | |
| | | Wipro Ltd. | -0.38 | | |
| | | Zee Entertainment Enterprises Ltd. | 0.024 | | |
| SUM | -0.312 | SUM | -0.902 | SUM | -9.271 |

 


The cluster membership table above (Table 3) lists the companies that fall under each cluster after the iterations. Cluster one is the most optimal portfolio in terms of return for an investor. It consists of 9 of the 44 company stocks considered for the analysis. Cluster two, with 27 companies, is the second most efficient portfolio for an investor seeking returns. Cluster three shows a comparatively lower return.

 

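The testing table above substantiates the portfolios that were created using cluster analysis. It exhibits the average return of each stock for the period 2017-2018. From the total for each cluster, it is seen that cluster 1 yields the maximum return (a total of -0.312), followed by cluster 2 (-0.902) and cluster 3 (-9.271). Since all three totals are negative, the maximum return here corresponds to the smallest total loss.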
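The cluster test described above amounts to totalling the stock returns within each cluster and ranking the totals. The following Python sketch reproduces that arithmetic from the returns listed in the testing table; the function names are illustrative. The recomputed total for cluster 2 may differ slightly from the printed figure if the displayed stock returns were rounded.

```python
# Returns (%) per stock copied from the testing table, grouped by cluster.
CLUSTER_RETURNS = {
    1: [-0.024, 0.108, -0.052, -0.466, 0.094, 0.067, -0.141, -0.164, 0.266],
    2: [-0.0016, 0.00453, -0.01, -0.013, -0.216, 0.044, -0.012, 0.149, -0.046,
        -0.027, -0.13, 0.031, -0.096, 0.037, 0.034, 0.024, 0.155, -0.375,
        0.006, -0.029, -0.012, -0.158, 0.124, -0.016, -0.017, -0.38, 0.024],
    3: [-0.029, 0.073, 0.037, 0.147, -0.091, 0.059, -7.819, -1.648],
}

def total_returns(cluster_returns):
    """Sum the stock returns within each cluster."""
    return {cluster: sum(values) for cluster, values in cluster_returns.items()}

def rank_clusters(totals):
    """Order cluster labels from the highest total return to the lowest."""
    return sorted(totals, key=totals.get, reverse=True)

totals = total_returns(CLUSTER_RETURNS)
ranking = rank_clusters(totals)
for cluster in ranking:
    print(f"Cluster {cluster}: total return {totals[cluster]:.3f}")
```

A simple total favours clusters with fewer stocks when most returns are negative, so comparing the mean return per stock alongside the total would be a natural robustness check.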

 

CONCLUSION:

From this study, we conclude that the selection of an optimal portfolio is necessary to yield maximum returns. The use of a clustering technique helps an investor reduce the time taken to make an investment decision, since clustering groups stocks of a similar nature. K-means clustering classified these stocks into 3 clusters based on the similarities of the variables considered, and the clusters were ranked in accordance with return and risk priorities. Priorities here mean maximum return with minimum risk. Thus, with the help of clustering we have identified cluster 1 as the most optimal portfolio and have tested its efficiency against the respective market returns of its stocks. This paper recommends an optimal portfolio for an individual investor to reap good returns.

 

REFERENCES:

1.        Ammann, M., Coqueret, G., and Schade, J.-P. (2016). Characteristics-based portfolio choice with leverage constraints. Journal of Banking and Finance, 70, 23-37.

2.        Anand, A., Chakravarty, S., and Chuwonganant, C. (2009). Cleaning house: Stock reassignments on the NYSE. Journal of Financial Markets, 12, 727-753.

3.        Bock, T. (2018, March 28). What is Hierarchical Clustering? Retrieved from Displayr Blog: https://www.displayr.com/what-is-hierarchical-clustering/

4.        Bonaparte, Y., Korniotis, G. M., and Alok. (2014). Income hedging and portfolio decisions. Journal of Financial Economics, 113(2), 300-324.

5.        Bragg, S. (2018, March 27). Turnover ratios. Retrieved from Accounting Tools: https://www.accountingtools.com/articles/what-is-a-turnover-ratio.html

6.        Campbell, J. Y., and Vuolteenaho, T. (2004). Bad Beta, Good Beta. The American Economic Review, 94, 1249-1275.

7.        Ciccotello, C., Greene, J., Ling, L., and Rakowski, D. (2011). Capacity and factor timing effects in active portfolio management. Journal of Financial Markets, 14, 277-300.

8.        Garvey, R., and Wu, F. (2014). Clustering of intraday order sizes by uninformed traders versus informed traders. Journal of Banking and Finance, 41, 222-235.

9.        Gonzalez, A., and Rubio, G. (2017). The joint cross-sectional variation of equity returns and volatilities. Journal of Banking and Finance, 75, 17-34.

10.      Huddart, S. (1999). Reputation and performance fee effects on portfolio choice by investment advisers. Journal of Financial Markets, 2, 227-271.

11.      Ingram, M., and Margetis, S. (2010). A practical method to estimate the cost of equity capital for a firm using cluster analysis. Managerial Finance, 36(2), 160-167.

12.      J.Garbade, D. (2018, September 13). Understanding K-means Clustering in Machine Learning. Retrieved from Towards Data Science: https://towardsdatascience.com/understanding-k-means-clustering-in-machine-learning-6a6e67336aa1

13.      Jacobs, H., Müller, S., and Weber, M. (2014). How should individual investors diversify? An empirical evaluation of alternative asset allocation policies. Journal of Financial Markets, 19, 62-85.

14.      Jiang, C., Ma, Y., and An, Y. (2010). An analysis of portfolio selection with background risk. Journal of Banking and Finance, 34, 3055-3060.

15.      Klotz, S., and Lindermeir, A. (2015). Multivariate credit portfolio management using cluster analysis. Journal of Risk Finance, 16(2), 145-163.

16.      Krishnamurti, C., Sequeira, J. M., and Fangjian, F. (2003). Stock exchange governance and market quality. Journal of Banking and Finance, 27, 1859-1878.

17.      L, F., GY, M., and N, M. S. (1981). Cluster Analysis. Acta Oeconomica, 26, 291-334.

18.      Lin, Q. (2018). Technical Analysis and Stock Return Predictability: An Aligned Approach. Journal of Financial Markets, 38, 103-123.

19.      Markowitz, H. (1952). Portfolio Selection. The Journal of Finance, 7, 77-91.

20.      Massa, M., Simonov, A., and Stenkrona, A. (2015). Style representation and portfolio choice. Journal of Financial Markets, 23, 1-25.

21.      Mei, X., and Nogales, F. J. (2018). Portfolio Selection with Proportional Transaction Costs and Predictability. Journal of Banking and Finance, 94, 131-151.

22.      Merkle, C. (2018). The curious case of negative volatility. Journal of Financial Markets, 40, 92-108.

23.      Mirkin, B. (1996). Mathematical Classification and Clustering. Springer Science and Business Media.

24.      Morgan, G., and Thomas, S. (1998). Taxes, dividend yields and returns in the UK equity market. Journal of Banking and Finance, 22(4), 405-423.

25.      Nanda, S. R., Mahanty, B., and Tiwari, M. K. (2010). Clustering Indian stock market data for portfolio management. Expert Systems with Applications, 37, 8793-8798.

26.      Ohta, W. (2006). An analysis of intraday patterns in price clustering on the Tokyo Stock Exchange. Journal of Banking and Finance, 30, 1023-1039.

27.      Palomino, F., and Sadrieh, A. (2011). Overconfidence and delegated portfolio management. Journal of Financial Intermediation, 20, 159-177.

28.      Sathyanarayana, S., and Harish, S. N. (2017). An Empirical study on stability beta in Indian stock market with special reference to CNX Nifty 50. Journal of financial risk management, 14, 16-35.

29.      Sha, T. L. (2017). Effects of Price Earnings Ratio, Earnings Per Share, Book to Market Ratio and Gross Domestic Product on Stock Prices of Property and Real Estate Companies in Indonesia Stock Exchange. International Journal of Economic Perspectives, 11, 1743-1754.

30.      Sidana, G., and Acharya, D. (2007). Classifying Mutual Funds In India: Some results from clustering. Indian Journal of Economics and Business, 6(1), 71-79.

31.      Yang, X., and Zhang, H. (2019). Extreme absolute strength of stocks and performance of momentum strategies. Journal of Financial Markets, 44, 71-90.

32.      Zhang, J., Jin, Z., and An, Y. (2017). Dynamic portfolio optimization with ambiguity aversion. Journal of Banking and Finance, 79, 95-109.

 

 

Received on 24.02.2020            Modified on 23.03.2020

Accepted on 29.04.2020           ©AandV Publications All rights reserved

Asian Journal of Management. 2020;11(2):207-212.

DOI: 10.5958/2321-5763.2020.00032.3